1 Get started
Before you start, make sure to load the tidyverse package.
library(tidyverse)
2 Let’s tidy some data sets
First, complete both tasks before you move to the extras.
1. relig_income
Have a look at the relig_income data set that is included in the tidyverse package. The data set contains the results of a survey asking people about their religion and income category.
What is not tidy about this data set? Can you fix it?
2. billboard
Have a look at the billboard data set that is included in the tidyverse package. The data set contains information about the chart rank of songs in the year 2000.
What is not tidy about this data set? Can you fix it?
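If you are unsure where to start, here is a minimal sketch of pivot_longer() on a made-up table. The tibble `scores` and its column names are hypothetical and not part of the exercise data. It only illustrates the general pattern of gathering column headers that are really values into a single column.

```r
library(tidyverse)

# Hypothetical toy table: the column names "2019" and "2020" are values
# of a variable (the year), not variables themselves.
scores <- tribble(
  ~student, ~`2019`, ~`2020`,
  "Ana",    3,       5,
  "Ben",    4,       2
)

# Gather all columns except `student` into a "year" column (old column
# names) and a "score" column (old cell values).
scores_long <- pivot_longer(
  scores,
  cols = -student,
  names_to = "year",
  values_to = "score"
)

scores_long
```

Each row of `scores_long` now holds one student, one year and one score. Whether the same pattern fits relig_income and billboard is for you to check.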
3 For the fast ones
- Check out the values_drop_na and names_prefix arguments of pivot_longer. What do they do and how can you use them with the billboard data?
- This is a bit tricky: How would you have to change the penguins table if you wanted to make such a plot:
[Plot not shown: in the source, the code that draws it failed in select() with the error "Column `bill_len` doesn't exist".]
Hint: First use dplyr to select only the columns that you need for the plot. Then think about how to use tidyr to transform the data so that it is ready for ggplot.
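The select-then-pivot idea from the hint, together with the names_prefix and values_drop_na arguments of pivot_longer, can be tried out on a small made-up table first. In this sketch the tibble `plants` and all of its column names are hypothetical, and the scatter plot at the end is only an assumption about what a plot of long data might look like. The exercise plot may need different columns, geoms or aesthetics.

```r
library(tidyverse)

# Hypothetical toy table with two measurement columns sharing the
# prefix "len_" and one missing value.
plants <- tribble(
  ~id, ~site, ~len_leaf, ~len_stem,
  1,   "A",   4.1,       10.2,
  2,   "B",   3.8,       NA
)

plants_long <- plants %>%
  # keep only the columns needed for the plot
  select(id, len_leaf, len_stem) %>%
  pivot_longer(
    cols = starts_with("len_"),
    names_to = "part",
    names_prefix = "len_",   # strip "len_" so `part` holds "leaf" / "stem"
    values_to = "length",
    values_drop_na = TRUE    # drop rows whose value is NA
  )

# With one row per measurement, the measurement type can be mapped
# to an aesthetic.
ggplot(plants_long, aes(x = part, y = length)) +
  geom_point()
```

Because the measurement type is now a column of its own, ggplot can map it to an axis, a colour or a facet, which is not possible while the measurements are spread over several columns.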